Add dartboard summarize subcommand #157

Open
MSpencer87 wants to merge 6 commits into rancher:main from MSpencer87:feature-summarize

Conversation

@MSpencer87
Contributor

Implements the dartboard summarize CLI subcommand used to gather:

  • Metrics
  • Profiles
  • Resource Counts

PR for #116

Please review carefully: Copilot and Gemini were used to assist with the bash script conversions.

Copilot AI left a comment

Pull request overview

This PR implements a new dartboard summarize CLI subcommand that gathers metrics, profiles, and resource counts from deployed clusters. The implementation includes three standalone Go tools (collect-profile, export-metrics, and resource-counts) that are built and executed by the summarize command. The PR also adds functionality to update the monitoring project annotation during deployment.

Changes:

  • Added dartboard summarize subcommand to collect diagnostics from clusters
  • Implemented three new diagnostic tools in Go with corresponding build scripts
  • Updated deployment process to annotate cattle-monitoring-system namespace with project ID
  • Added unused GetProject function to kubectl package

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 24 comments.

| File | Description |
| --- | --- |
| cmd/dartboard/main.go | Registers the new summarize subcommand |
| cmd/dartboard/subcommands/summarize.go | Implements summarize logic: builds and runs diagnostic tools |
| cmd/dartboard/subcommands/deploy.go | Adds updateMonitoringProject function to annotate monitoring namespace |
| internal/kubectl/kubectl.go | Adds GetProject helper function (unused) |
| summarize-tools/collect-profile/* | Go tool and build script for collecting heap/CPU profiles |
| summarize-tools/export-metrics/* | Go tool, build script, and pod manifest for exporting Prometheus metrics |
| summarize-tools/resource-counts/* | Go tool, build script, and legacy bash script for counting Kubernetes resources |
| .gitignore | Adds patterns to ignore generated binaries and result directories |

Comment on lines +10 to +20
    containers:
    - name: mimirtool
      image: grafana/mimirtool:2.13.0
      command: ["/bin/sh", "-c"]
      args:
      - |
        echo "Mimirtool pod is running. Use 'kubectl exec' to run commands."
        # Keep the container running
        while true; do
          sleep 30
        done
Copilot AI Jan 26, 2026

The mimirtool pod runs an infinite loop with a 30-second sleep, which will consume resources indefinitely. Consider adding resource limits (requests/limits for CPU and memory) to this pod specification to prevent it from consuming excessive cluster resources.

Comment on lines +192 to +193
// Complex bash command inside exec needs careful wrapping
metricsCmd := `curl -s -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" -k https://127.0.0.1/metrics`
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metrics command uses 'curl -k' (insecure, skipping TLS verification) to fetch metrics from https://127.0.0.1/metrics. While this is localhost and might be acceptable in a testing/development context, consider documenting why TLS verification is skipped or using a more secure alternative if possible.

Suggested change
- // Complex bash command inside exec needs careful wrapping
- metricsCmd := `curl -s -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" -k https://127.0.0.1/metrics`
+ // Use in-cluster CA bundle for TLS verification when fetching metrics
+ metricsCmd := `curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://127.0.0.1/metrics`

Member

@git-ival git-ival left a comment

Hey @MSpencer87 👋
We should probably not have the different parts of summarize offload to separate module binaries. Can we refactor the collect-profile, export-metrics, and resource-counts modules into internal packages instead? Additionally, it would be good to have flags to optionally disable one or more of these when running dartboard summarize.

We can have separate module binaries for each of them, but the code providing the actual functionality should be contained within internal packages so that summarize and any other modules that need it can call the functions directly without relying on a separate binary.
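
One way the requested structure could look is sketched below. This is an illustration under assumptions, not the PR's code: the flag names (--skip-metrics, --skip-profiles, --skip-resource-counts, --out) are hypothetical, the standard flag package stands in for whatever CLI library dartboard uses, and the stub functions stand in for functions that would live in internal packages.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// collector pairs a name with the function that gathers its data.
// In a real layout each run function would be exported from an internal
// package so that summarize and any standalone binary can share it.
type collector struct {
	name string
	run  func(outDir string) error
}

// options holds the hypothetical opt-out flags.
type options struct {
	outDir       string
	skipMetrics  bool
	skipProfiles bool
	skipCounts   bool
}

// parseOptions parses the command-line arguments into options.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("summarize", flag.ContinueOnError)
	fs.StringVar(&o.outDir, "out", "summarize-results", "output directory")
	fs.BoolVar(&o.skipMetrics, "skip-metrics", false, "do not export metrics")
	fs.BoolVar(&o.skipProfiles, "skip-profiles", false, "do not collect profiles")
	fs.BoolVar(&o.skipCounts, "skip-resource-counts", false, "do not count resources")
	err := fs.Parse(args)
	return o, err
}

// selectCollectors drops the collectors that were disabled by flags.
func selectCollectors(o options, all []collector) []collector {
	skip := map[string]bool{
		"metrics":         o.skipMetrics,
		"profiles":        o.skipProfiles,
		"resource-counts": o.skipCounts,
	}
	var selected []collector
	for _, c := range all {
		if !skip[c.name] {
			selected = append(selected, c)
		}
	}
	return selected
}

// runAll runs every collector and reports all failures together
// instead of stopping at the first one.
func runAll(cs []collector, outDir string) error {
	var errs []error
	for _, c := range cs {
		if err := c.run(outDir); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", c.name, err))
		}
	}
	return errors.Join(errs...)
}

// stub stands in for a real internal-package function.
func stub(name string) func(string) error {
	return func(outDir string) error {
		fmt.Printf("collecting %s into %s\n", name, outDir)
		return nil
	}
}

// defaultCollectors lists the three collectors named in this PR.
func defaultCollectors() []collector {
	return []collector{
		{name: "metrics", run: stub("metrics")},
		{name: "profiles", run: stub("profiles")},
		{name: "resource-counts", run: stub("resource-counts")},
	}
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	if err := runAll(selectCollectors(o, defaultCollectors()), o.outDir); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

With this shape, a standalone binary for any one collector reduces to a thin main that calls the same function, so no collector depends on building and executing another binary.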

@git-ival
Member

git-ival commented Feb 5, 2026

We don't need to include all of these in this PR (we can iterate later), but I have some more notes after looking at this again:

  • Once this PR is merged, we will need to update the to-be-merged dartboard-choice Jenkins job
  • We may want to make use of the standard tar package (https://pkg.go.dev/archive/tar)
  • Ideally the generated "summarize-results" directory would at least be prefixed with the tofu_workspace used in the dart file
    • It may even make sense to put the directory inside of the {terraform_workspace}_config/ directory that is generated by the tofu modules
  • Finally, it would be cool if copy/paste-able instructions for how to visualize/use the collected metrics and profiles were generated automatically and printed to standard output
    • A very short how-to for importing the tsdb files into grafana/prometheus and viewing them
    • A very simple how-to for running go tool pprof with the different profiles
